[Codegen] Use DMA for LHS/RHS only in scaled matmul#23760
Draft
lialan wants to merge 3 commits into users/lialan/subbyte_gather_to_lds from
d3c3f1d to
f654410
Compare
* For now, remove the blanket guard that disabled DMA for all scaled matmuls.
* When DMA is manually enabled, XOR swizzle is disabled (for now).
* Use DMA (UseGlobalLoadDMAAttr) for LHS/RHS operands.
* Fix lowering of DMA copy.
f654410 to
88f6a9a
Compare
6e49c10 to
c1f3a75
Compare
Contributor
Author
Working on DMA with XOR swizzle support in a subsequent PR.
Revert destination indices from divergent (srcLinearOffset) back to subgroup-uniform (linearOffsetVal). The gather_to_lds op contract specifies that only lane 0's dstIndices are used, so the dst base must be uniform.

Also add a TODO in the scaled matmul DMA pipeline test noting that gather_to_lds is not yet produced for scaled operands.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
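To illustrate why the dst base has to be subgroup-uniform, here is a toy Python model of the contract described above. It is only a sketch, not the actual lowering: the function `gather_to_lds_model`, the 4-lane subgroup, and the "lane i lands at base + i" layout are all assumptions made for illustration. The one property taken from the commit message is that only lane 0's destination index is honored.

```python
def gather_to_lds_model(global_mem, src_indices, dst_indices, lds_size):
    """Toy model of a subgroup-wide gather into LDS (hypothetical helper).

    Each lane reads its own source element, but the destination base comes
    from lane 0 only. Assumed layout: lane i writes to base + i.
    """
    lds = [None] * lds_size
    base = dst_indices[0]  # indices supplied by lanes 1..N-1 are ignored
    for lane, src in enumerate(src_indices):
        lds[base + lane] = global_mem[src]
    return lds


global_mem = list(range(100, 116))
src = [0, 1, 2, 3]  # per-lane (divergent) source indices are fine

# Uniform destination: every lane passes the same base, so nothing is lost.
uniform = gather_to_lds_model(global_mem, src, [4, 4, 4, 4], lds_size=8)

# Divergent destination: the caller may expect a strided write to 0, 2, 4, 6,
# but only lane 0's value (0) is used, so the data lands contiguously at 0..3.
divergent = gather_to_lds_model(global_mem, src, [0, 2, 4, 6], lds_size=8)

print(uniform)
print(divergent)
```

In this model, passing per-lane destination offsets silently produces a different layout than the caller intended, which is the kind of mismatch the revert to linearOffsetVal avoids.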
Step 2 of enabling DMA for scaled GEMMs. This patch enables DMA for scaled GEMMs, but disables XOR swizzle whenever DMA is enabled.